Smart Paper Notes

Design and Analysis of Cold Gas Thruster to De-Orbit the PSLV Debris

Roshan Sah, Raunak Srivastava, Kaushik Das

Category: Robotics

2022-08-07

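A major concern in the space sector today is the uncontrolled growth of space debris and the likelihood of its collision with spacecraft, especially in low Earth orbit (LEO). This paper aims to design an optimized micro-propulsion system, a cold gas thruster, to de-orbit PSLV debris from an altitude of 668 km to 250 km. The propulsion system consists mainly of a storage tank, pipes, control valves, and a convergent-divergent (CD) nozzle. The paper outlines the design of each component through a continuous iterative process that is repeated until the design thrust requirement is met. All components are designed in CATIA V5, and structural analysis of each component is carried out in ANSYS, showing that the cylindrical tank can withstand the high hoop stress generated on its walls. Flow through the CD nozzle is analyzed with the k-$\epsilon$ turbulence model, and the de-orbited debris is intended to re-enter Earth's atmosphere and burn up. Hohmann's orbit transfer method is used to de-orbit the PSLV space debris, and the maneuver is simulated in STK. The results show that the optimized thruster generates enough thrust to de-orbit the PSLV debris to a very low orbit.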
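
The Hohmann transfer named in this abstract can be sized with a short calculation. The sketch below is not taken from the paper: it assumes circular initial and final orbits, ideal impulsive burns, and standard values for Earth's gravitational parameter and equatorial radius. The function names are illustrative only.

```python
import math

# Assumed constants (standard reference values, not from the paper).
MU_EARTH = 398600.4418  # km^3/s^2, Earth's gravitational parameter
R_EARTH = 6378.137      # km, Earth's equatorial radius


def hohmann_delta_v(h_initial_km, h_final_km):
    """Return (dv1, dv2) in km/s for a two-burn Hohmann transfer
    between two circular orbits given by their altitudes."""
    r1 = R_EARTH + h_initial_km
    r2 = R_EARTH + h_final_km
    a_transfer = 0.5 * (r1 + r2)  # semi-major axis of the transfer ellipse

    # Circular-orbit speeds at the two radii.
    v_circ1 = math.sqrt(MU_EARTH / r1)
    v_circ2 = math.sqrt(MU_EARTH / r2)

    # Speeds on the transfer ellipse at r1 and r2 (vis-viva equation).
    v_trans1 = math.sqrt(MU_EARTH * (2.0 / r1 - 1.0 / a_transfer))
    v_trans2 = math.sqrt(MU_EARTH * (2.0 / r2 - 1.0 / a_transfer))

    # Magnitudes only: for a descent both burns are retrograde.
    return abs(v_trans1 - v_circ1), abs(v_circ2 - v_trans2)


def transfer_time_s(h_initial_km, h_final_km):
    """Half the period of the transfer ellipse, in seconds."""
    a_transfer = R_EARTH + 0.5 * (h_initial_km + h_final_km)
    return math.pi * math.sqrt(a_transfer ** 3 / MU_EARTH)


if __name__ == "__main__":
    dv1, dv2 = hohmann_delta_v(668.0, 250.0)
    print(f"burn 1: {dv1 * 1000:.1f} m/s, burn 2: {dv2 * 1000:.1f} m/s")
    print(f"transfer time: {transfer_time_s(668.0, 250.0) / 60:.1f} min")
```

Under these assumptions the 668 km to 250 km transfer needs roughly 0.23 km/s in total. A real cold gas thruster delivers low thrust over a finite burn, so the impulsive figure is only a first estimate.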

Differentiable Rendering for Pose Estimation in Proximity Operations

Ramchander Rao Bhaskara, Roshan Thomas Eapen, Manoranjan Majji

Category: Computer Vision | Robotics

2022-12-24

Differentiable rendering aims to compute the derivative of the image rendering function with respect to the rendering parameters. This paper presents a novel algorithm for 6-DoF pose estimation through gradient-based optimization using a differentiable rendering pipeline. We emphasize two key contributions: (1) Instead of solving the conventional 2D to 3D correspondence problem and computing reprojection errors, images (rendered using the 3D model) are compared only in the 2D feature space via sparse 2D feature correspondences. (2) Instead of an analytical image formation model, we compute an approximate local gradient of the rendering process through online learning. The learning data consists of image features extracted from multi-viewpoint renders at small perturbations in the pose neighborhood. The gradients are propagated through the rendering pipeline for the 6-DoF pose estimation using nonlinear least squares. This gradient-based optimization regresses directly upon the pose parameters by aligning the 3D model to reproduce a reference image shape. Using representative experiments, we demonstrate the application of our approach to pose estimation in proximity operations.
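
The idea of estimating a local gradient from small pose perturbations and feeding it to nonlinear least squares can be illustrated with a toy problem. The sketch below is not the authors' pipeline. It swaps the 6-DoF renderer for a hypothetical 2D stand-in (`render_features`, with pose `theta, tx, ty`), uses central finite differences in place of the learned gradient, and runs plain Gauss-Newton. All names and the model points are assumptions made for illustration.

```python
import math

# Hypothetical 2D "model" points; a stand-in for features of a rendered 3D model.
MODEL = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.5, -0.8)]


def render_features(pose):
    """Stand-in renderer: map pose (theta, tx, ty) to a flat list of 2D feature coordinates."""
    theta, tx, ty = pose
    c, s = math.cos(theta), math.sin(theta)
    out = []
    for x, y in MODEL:
        out.extend((c * x - s * y + tx, s * x + c * y + ty))
    return out


def numeric_jacobian(f, pose, eps=1e-5):
    """Central-difference Jacobian from small pose perturbations (rows: features, columns: pose parameters)."""
    cols = []
    for j in range(len(pose)):
        plus, minus = list(pose), list(pose)
        plus[j] += eps
        minus[j] -= eps
        cols.append([(a - b) / (2 * eps) for a, b in zip(f(plus), f(minus))])
    return [list(row) for row in zip(*cols)]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def gauss_newton(reference, pose0, iters=20):
    """Align rendered features to the reference features by Gauss-Newton on the pose."""
    pose = list(pose0)
    n = len(pose)
    for _ in range(iters):
        r = [a - b for a, b in zip(render_features(pose), reference)]
        J = numeric_jacobian(render_features, pose)
        JtJ = [[sum(row[i] * row[j] for row in J) for j in range(n)] for i in range(n)]
        Jtr = [sum(row[i] * ri for row, ri in zip(J, r)) for i in range(n)]
        step = solve(JtJ, [-g for g in Jtr])  # normal equations: JtJ * step = -Jt r
        pose = [p + d for p, d in zip(pose, step)]
    return pose


true_pose = [0.3, 0.5, -0.2]
estimate = gauss_newton(render_features(true_pose), [0.0, 0.0, 0.0])
print(estimate)
```

The optimizer only ever calls the renderer as a black box, which is the property that makes a perturbation-based gradient attractive when no analytical image formation model is available.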

SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks

Suwon Shon, Siddhant Arora, Chyi-Jiunn Lin, Ankita Pasad, Felix Wu, Roshan Sharma, Wei-Lun Wu, Hung-Yi Lee, Karen Livescu, Shinji Watanabe

Category: Natural Language Processing

2022-12-20

Spoken language understanding (SLU) tasks have been studied for many decades in the speech research community, but have not received as much attention as lower-level tasks like speech and speaker recognition. In particular, there are not nearly as many SLU task benchmarks, and many of the existing ones use data that is not freely available to all researchers. Recent work has begun to introduce such benchmark datasets for several tasks. In this work, we introduce several new annotated SLU benchmark tasks based on freely available speech data, which complement existing benchmarks and address gaps in the SLU evaluation landscape. We contribute four tasks: question answering and summarization involve inference over longer speech sequences; named entity localization addresses the speech-specific task of locating the targeted content in the signal; dialog act classification identifies the function of a given speech utterance. We follow the blueprint of the Spoken Language Understanding Evaluation (SLUE) benchmark suite. In order to facilitate the development of SLU models that leverage the success of pre-trained speech representations, we will be publishing for each task (i) annotations for a relatively small fine-tuning set, (ii) annotated development and test sets, and (iii) baseline models for easy reproducibility and comparisons. In this work, we present the details of data collection and annotation and the performance of the baseline models. We also perform sensitivity analysis of pipeline models' performance (speech recognizer + text model) to the speech recognition accuracy, using more than 20 state-of-the-art speech recognition models.

3rd Continual Learning Workshop Challenge on Egocentric Category and Instance Level Object Understanding

Lorenzo Pellegrini, Chenchen Zhu, Fanyi Xiao, Zhicheng Yan, Antonio Carta, Matthias De Lange, Vincenzo Lomonaco, Roshan Sumbaly, Pau Rodriguez, David Vazquez

Category: Computer Vision | Artificial Intelligence | Machine Learning

2022-12-13

Continual Learning, also known as Lifelong or Incremental Learning, has recently gained renewed interest among the Artificial Intelligence research community. Recent research efforts have quickly led to the design of novel algorithms able to reduce the impact of the catastrophic forgetting phenomenon in deep neural networks. Due to this surge of interest in the field, many competitions have been held in recent years, as they are an excellent opportunity to stimulate research in promising directions. This paper summarizes the ideas, design choices, rules, and results of the challenge held at the 3rd Continual Learning in Computer Vision (CLVision) Workshop at CVPR 2022. The focus of this competition is the complex continual object detection task, which is still underexplored in the literature compared to classification tasks. The challenge is based on the challenge version of the novel EgoObjects dataset, a large-scale egocentric object dataset explicitly designed to benchmark continual learning algorithms for egocentric category-/instance-level object understanding, which covers more than 1k unique main objects and 250+ categories in around 100k video frames.

"I think this is the most disruptive technology": Exploring Sentiments of ChatGPT Early Adopters using Twitter Data

Mubin Ul Haque, Isuru Dharmadasa, Zarrin Tasnim Sworna, Roshan Namal Rajapakse, Hussain Ahmad

Category: Natural Language Processing

2022-12-12

Large language models have recently attracted significant attention due to their impressive performance on a variety of tasks. ChatGPT, developed by OpenAI, is one such implementation of a large, pre-trained language model that has gained immense popularity among early adopters, with certain users going so far as to characterize it as a disruptive technology in many domains. Understanding such early adopters' sentiments is important because it can provide insights into the potential success or failure of the technology, as well as its strengths and weaknesses. In this paper, we conduct a mixed-method study using 10,732 tweets from early ChatGPT users. We first use topic modelling to identify the main topics and then perform an in-depth qualitative sentiment analysis of each topic. Our results show that the majority of the early adopters have expressed overwhelmingly positive sentiments related to topics such as Disruptions to software development, Entertainment and exercising creativity. Only a limited percentage of users expressed concerns about issues such as the potential for misuse of ChatGPT, especially regarding topics such as Impact on educational aspects. We discuss these findings by providing specific examples for each topic and then detail implications related to addressing these concerns for both researchers and users.

Five Properties of Specific Curiosity You Didn't Know Curious Machines Should Have

Nadia M. Ady, Roshan Shariff, Johannes Günther, Patrick M. Pilarski

Category: Artificial Intelligence | Machine Learning

2022-12-01

Curiosity for machine agents has been a focus of lively research activity. The study of human and animal curiosity, particularly specific curiosity, has unearthed several properties that would offer important benefits for machine learners, but that have not yet been well-explored in machine intelligence. In this work, we conduct a comprehensive, multidisciplinary survey of the field of animal and machine curiosity. As a principal contribution of this work, we use this survey as a foundation to introduce and define what we consider to be five of the most important properties of specific curiosity: 1) directedness towards inostensible referents, 2) cessation when satisfied, 3) voluntary exposure, 4) transience, and 5) coherent long-term learning. As a second main contribution of this work, we show how these properties may be implemented together in a proof-of-concept reinforcement learning agent: we demonstrate how the properties manifest in the behaviour of this agent in a simple non-episodic grid-world environment that includes curiosity-inducing locations and induced targets of curiosity. As we would hope, our example of a computational specific curiosity agent exhibits short-term directed behaviour while updating long-term preferences to adaptively seek out curiosity-inducing situations. This work, therefore, presents a landmark synthesis and translation of specific curiosity to the domain of machine learning and reinforcement learning and provides a novel view into how specific curiosity operates and in the future might be integrated into the behaviour of goal-seeking, decision-making computational agents in complex environments.

Egocentric Audio-Visual Noise Suppression

Roshan Sharma, Weipeng He, Ju Lin, Egor Lakomkin, Yang Liu, Kaustubh Kalgaonkar

Category: Natural Language Processing

2022-11-07

This paper studies audio-visual noise suppression for egocentric videos, where the speaker is not captured in the video. Instead, potential noise sources are visible on screen with the camera emulating the off-screen speaker's view of the outside world. This setting is different from prior work in audio-visual speech enhancement that relies on lip and facial visuals. In this paper, we first demonstrate that egocentric visual information is helpful for noise suppression. We compare object recognition and action classification based visual feature extractors, and investigate methods to align audio and visual representations. Then, we examine different fusion strategies for the aligned features, and locations within the noise suppression model to incorporate visual information. Experiments demonstrate that visual features are most helpful when used to generate additive correction masks. Finally, in order to ensure that the visual features are discriminative with respect to different noise types, we introduce a multi-task learning framework that jointly optimizes audio-visual noise suppression and video-based acoustic event detection. This proposed multi-task framework outperforms the audio-only baseline on all metrics, including a 0.16 PESQ improvement. Extensive ablations reveal the improved performance of the proposed model with multiple active distractors, over all noise types and across different SNRs.

XNOR-FORMER: Learning Accurate Approximations in Long Speech Transformers

Roshan Sharma, Bhiksha Raj

Category: Natural Language Processing | Artificial Intelligence

2022-10-29

Transformers are among the state of the art for many tasks in speech, vision, and natural language processing, among others. Self-attention, a crucial contributor to this performance, has quadratic computational complexity, which makes training on longer input sequences challenging. Prior work has produced state-of-the-art transformer variants with linear attention; however, current models sacrifice performance to achieve efficient implementations. In this work, we develop a novel linear transformer by examining the properties of the key-query product within self-attention. Our model outperforms state-of-the-art approaches on speech recognition and speech summarization, resulting in a 1% absolute WER improvement on the Librispeech-100 speech recognition benchmark and a new INTERVIEW speech recognition benchmark, and 5 points on ROUGE for summarization with How2.
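
For context on why linear attention avoids the quadratic cost, the sketch below shows the generic kernelized reordering used by linear transformers in general. It is not the XNOR-Former formulation, which the abstract does not spell out. The feature map `phi` (elu(x) + 1) and all function names are assumptions chosen for illustration.

```python
import math


def phi(x):
    """Positive feature map elu(x) + 1, a common choice in kernelized linear attention."""
    return x + 1.0 if x > 0 else math.exp(x)


def feature_map(M):
    return [[phi(x) for x in row] for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def quadratic_attention(Q, K, V):
    """Reference form: builds the full n x n similarity matrix, O(n^2 d)."""
    S = matmul(feature_map(Q), transpose(feature_map(K)))  # n x n
    out = []
    for weights in S:
        z = sum(weights)  # row normalizer
        out.append([sum(w * v[c] for w, v in zip(weights, V)) / z
                    for c in range(len(V[0]))])
    return out


def linear_attention(Q, K, V):
    """Reordered form phi(Q) (phi(K)^T V): never forms the n x n matrix, O(n d^2)."""
    Qf, Kf = feature_map(Q), feature_map(K)
    KV = matmul(transpose(Kf), V)           # d x d_v summary of keys and values
    k_sum = [sum(col) for col in zip(*Kf)]  # length-d sum of mapped keys
    out = []
    for q in Qf:
        z = sum(a * b for a, b in zip(q, k_sum))
        out.append([sum(q[i] * KV[i][c] for i in range(len(q))) / z
                    for c in range(len(V[0]))])
    return out


Q = [[0.1, -0.5], [1.2, 0.3], [-0.7, 0.8]]
K = [[0.4, 0.2], [-1.0, 0.6], [0.9, -0.3]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(quadratic_attention(Q, K, V))
print(linear_attention(Q, K, V))
```

Both functions compute the same output up to floating-point error. The linear form simply applies associativity of matrix multiplication, which is why its cost grows with sequence length n rather than n squared.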

ViWiD: Leveraging WiFi for Robust and Resource-Efficient SLAM

Aditya Arun, William Hunter, Roshan Ayyalasomayajula, Dinesh Bharadia

Category: Robotics

2022-09-16

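Recent interest in autonomous navigation and exploration robots for indoor applications has spurred research into indoor Simultaneous Localization and Mapping (SLAM) robot systems. Although most of these SLAM systems use visual and LiDAR sensors in tandem with an odometry sensor, odometry sensors drift over time. To combat this drift, visual SLAM systems deploy compute- and memory-intensive search algorithms to detect "loop closures", which make the trajectory estimate globally consistent. To circumvent these resource-intensive (compute and memory) algorithms, we present ViWiD, which integrates WiFi and visual sensors in a dual-layered system. This dual-layered approach separates the tasks of local and global trajectory estimation, making ViWiD resource-efficient while achieving performance on par with or better than state-of-the-art visual SLAM. We demonstrate ViWiD's performance on four datasets covering over 1500 m of traversed path, and show 4.3x and 4x reductions in compute and memory consumption, respectively, compared with state-of-the-art visual and LiDAR SLAM systems, with on-par SLAM performance.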

Accelerating Deep Learning Model Inference on Arm CPUs with Ultra-Low Bit Quantization and Runtime

Saad Ashfaq, MohammadHossein AskariHemmat, Sudhakar Sah, Ehsan Saboori, Olivier Mastropietro, Alexander Hoffman

Category: Machine Learning | Artificial Intelligence

2022-07-18

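Deep learning has been one of the most disruptive technological advancements in recent times. The high performance of deep learning models comes at the expense of high computational, storage, and power requirements. Sensing the immediate need to accelerate and compress these models to improve on-device performance, we introduce Deeplite Neutrino for production-ready optimization of models and Deeplite Runtime for deploying ultra-low bit quantized models on Arm-based platforms. We implement low-level quantization kernels for the Armv7 and Armv8 architectures, enabling deployment on 32-bit and 64-bit Arm-based devices. With efficient implementations using vectorization, parallelization, and tiling, we achieve speedups of up to 2x and 2.2x over TensorFlow Lite with the XNNPACK backend on classification and detection models, respectively. We also obtain significant speedups of up to 5x and 3.2x over ONNX Runtime for classification and detection models, respectively.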
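
To make "ultra-low bit quantization" concrete, the sketch below shows plain symmetric uniform quantization and a dot product that accumulates in integers. It is a generic illustration, not Deeplite's kernels or API. The function names and the per-tensor scale are assumptions.

```python
def quantize(values, bits):
    """Symmetric uniform quantization: return (integer codes, scale).

    With `bits` bits the codes lie in [-qmax, qmax], where
    qmax = 2**(bits - 1) - 1 (1 for 2-bit, 7 for 4-bit).
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]


def quantized_dot(a_codes, a_scale, b_codes, b_scale):
    """Accumulate in integers, then rescale once at the end."""
    acc = sum(x * y for x, y in zip(a_codes, b_codes))
    return acc * a_scale * b_scale


weights = [0.9, -0.4, 0.1, -0.9]
activations = [0.5, 0.25, -1.0, 0.75]

w_codes, w_scale = quantize(weights, bits=4)
a_codes, a_scale = quantize(activations, bits=4)

exact = sum(w * a for w, a in zip(weights, activations))
approx = quantized_dot(w_codes, w_scale, a_codes, a_scale)
print(w_codes, a_codes)
print(exact, approx)
```

Keeping the inner loop in small integers is what lets optimized kernels pack several values per register. The vectorization, parallelization, and tiling mentioned in the abstract are applied on top of this basic scheme and are not shown here.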
